 unsupervised method


Multi-body SE(3) Equivariance for Unsupervised Rigid Segmentation and Motion Estimation (Supplementary Material)

Neural Information Processing Systems

In contrast, our unsupervised multi-body task requires the model to handle part-level local equivariance. Figure 1: Structure of our feature extractor based on EPN. "EPNConv" is the SE(3)-equivariant convolution proposed in the vanilla EPN network. Part-level SE(3)-equivariance is desirable for motion analysis, especially rotation estimation. Song and Yang utilized the methodology proposed by Choy et al. All other objects were considered part of the background.


NeuralGF: Unsupervised Point Normal Estimation by Learning Neural Gradient Function -- Supplementary Material -- Qing Li

Neural Information Processing Systems

We provide the optimization time (i.e., training time, shown in brackets) and the inference time of our method. Our method improves on the state-of-the-art results while using far fewer parameters. The surfaces are reconstructed from point clouds with low noise (a) and high noise (b). In Fig. 2, we show the reconstructed surfaces on point clouds with different noise levels. A partially enlarged view is provided for each shape.


0aa800df4298539770b57824afc77a89-Supplemental-Conference.pdf

Neural Information Processing Systems

For all datasets, we used standard normalization, which scales each feature to have zero mean and a standard deviation of one. The autoencoder consists of one hidden layer with sigmoid activation. A linear activation is used for the output layer. We use a hidden layer of 200 neurons for all datasets. We trained on each dataset for 10 epochs using stochastic gradient descent with a momentum of 0.9 and a batch size of 128.
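The setup described above can be sketched in plain NumPy. This is only an illustrative sketch, not the authors' implementation: the snippet does not state the loss, learning rate, or weight initialization, so the mean squared reconstruction error, `lr=0.01`, and the Gaussian initialization below are assumptions, and the function names `standardize` and `train_autoencoder` are hypothetical.

```python
import numpy as np

def standardize(X, eps=1e-8):
    """Scale each feature to zero mean and unit standard deviation."""
    return (X - X.mean(axis=0)) / (X.std(axis=0) + eps)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, hidden=200, epochs=10, batch_size=128,
                      lr=0.01, momentum=0.9, seed=0):
    """One sigmoid hidden layer, linear output, SGD with momentum.

    hidden, epochs, batch_size and momentum follow the text;
    lr, the MSE loss and the initialization are assumptions.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    params = {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(d), (d, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, d)),
        "b2": np.zeros(d),
    }
    velocity = {k: np.zeros_like(v) for k, v in params.items()}
    losses = []
    for _ in range(epochs):
        order = rng.permutation(n)
        squared_error = 0.0
        for start in range(0, n, batch_size):
            xb = X[order[start:start + batch_size]]
            h = sigmoid(xb @ params["W1"] + params["b1"])  # hidden layer
            out = h @ params["W2"] + params["b2"]          # linear output
            err = out - xb
            squared_error += (err ** 2).sum()
            # Backpropagate the mean squared error of this mini-batch.
            g_out = 2.0 * err / err.size
            g_h = (g_out @ params["W2"].T) * h * (1.0 - h)
            grads = {
                "W2": h.T @ g_out, "b2": g_out.sum(axis=0),
                "W1": xb.T @ g_h, "b1": g_h.sum(axis=0),
            }
            for k in params:
                velocity[k] = momentum * velocity[k] - lr * grads[k]
                params[k] += velocity[k]
        losses.append(squared_error / (n * d))  # running MSE over the epoch
    return params, losses

# Usage on synthetic data (a stand-in for a real dataset).
demo_rng = np.random.default_rng(1)
X_demo = standardize(demo_rng.normal(size=(512, 5)) @ demo_rng.normal(size=(5, 20)))
demo_params, demo_losses = train_autoencoder(X_demo)
print(demo_losses)
```

The per-epoch loss is a running average collected while the weights are still being updated, so it is a training diagnostic rather than a held-out reconstruction error.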


Adaptive Parameter Optimization for Robust Remote Photoplethysmography

Morales, Cecilia G., Teh, Fanurs Chi En, Li, Kai, Agrawal, Pushpak, Dubrawski, Artur

arXiv.org Artificial Intelligence

Remote photoplethysmography (rPPG) enables contactless vital sign monitoring using standard RGB cameras. However, existing methods rely on fixed parameters optimized for particular lighting conditions and camera setups, limiting adaptability to diverse deployment environments. This paper introduces the Projection-based Robust Signal Mixing (PRISM) algorithm, a training-free method that jointly optimizes photometric detrending and color mixing through online parameter adaptation based on signal quality assessment. PRISM achieves state-of-the-art performance among unsupervised methods, with MAE of 0.77 bpm on PURE and 0.66 bpm on UBFC-rPPG, and accuracy of 97.3% and 97.5% respectively at a 5 bpm threshold. Statistical analysis confirms PRISM performs equivalently to leading supervised methods ($p > 0.2$), while maintaining real-time CPU performance without training. This validates that adaptive time series optimization significantly improves rPPG across diverse conditions.
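For readers unfamiliar with the two reported metrics, the sketch below shows one common way to compute a mean absolute error in bpm and an accuracy at a bpm threshold. It is an assumption-laden illustration, not the paper's evaluation code: the abstract does not say how heart-rate estimates are windowed or whether the threshold is inclusive, and the function name `heart_rate_metrics` is hypothetical.

```python
import numpy as np

def heart_rate_metrics(hr_pred, hr_true, threshold_bpm=5.0):
    """Return (MAE in bpm, fraction of estimates within threshold_bpm).

    Assumes one predicted and one reference heart rate per evaluation
    window, and treats the threshold as inclusive (an assumption).
    """
    hr_pred = np.asarray(hr_pred, dtype=float)
    hr_true = np.asarray(hr_true, dtype=float)
    abs_err = np.abs(hr_pred - hr_true)
    mae = float(abs_err.mean())
    accuracy = float((abs_err <= threshold_bpm).mean())
    return mae, accuracy

# Made-up example values, only to show the call.
mae, acc = heart_rate_metrics([70, 82, 60, 95], [71, 80, 60, 88])
print(mae, acc)
```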


We thank the reviewers for their constructive reviews which clearly show that all reviewers have thoroughly read the pa-1

Neural Information Processing Systems

In our comments, we address the reviews in the order of the reviewer number. The reviewer notes that on page 4, line 152, the phrase "no limitation is needed on the receptive field size" would be better compared against DGCNN than against PointNet, as it also uses graph convolutions. The FoldingNet decoder alone has 1.05M parameters. Unfortunately, the parameter count of the encoder is not stated, but the code indicates around 0.65M parameters. The reviewer points out that some additional ablation study might be beneficial.



reviewers; we will make sure to update our manuscript accordingly. 1 Comparison with Other Unsupervised Methods (R1)

Neural Information Processing Systems

We would like to thank the reviewers for their comments and suggestions. In particular, TimeNet is a seq2seq method relying on an autoencoding loss and using LSTMs as encoder and decoder. TimeNet, and notably do not scale to long time series (as explained on lines 144-157), unlike ours. However, we did perform experiments on some datasets with different loss variants. We will add insights on this matter to the paper.




0ed9422357395a0d4879191c66f4faa2-AuthorFeedback.pdf

Neural Information Processing Systems

There has been a misunderstanding! The underlying network is set up to estimate the image from a single measurement. But for evaluation on the test set, our network is given only one measurement as input. Thus, all methods are provided identical inputs (one measurement, no overlap) that match the noted ratios for CS. D-AMP estimation: they train denoiser networks for use in unrolled AMP iterations for CS recovery.